Configuring Default Aggregations
Aggregations help summarize quantitative data across different groupings or time periods.
Understanding Aggregations
When to Use Aggregations
Aggregations are essential when you need to:
- Calculate sums of quantitative values (e.g., total sales)
- Find averages across groups (e.g., average temperature by city)
- Determine extreme values (e.g., maximum price per category)
- Count occurrences within groups
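To make the four cases above concrete, here is a minimal plain JavaScript sketch of group-wise aggregation. It does not use DataModel; the `aggregateBy` helper and the `sales` rows are hypothetical and exist only to show what "aggregate a measure per group" means.

```javascript
// Hypothetical helper: collect one numeric field per group, then reduce it.
function aggregateBy(rows, groupField, valueField, reducer) {
  const groups = new Map();
  for (const row of rows) {
    const key = row[groupField];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row[valueField]);
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reducer(values);
  }
  return result;
}

// One reducer per use case listed above.
const sum = (values) => values.reduce((a, b) => a + b, 0);
const avg = (values) => sum(values) / values.length;
const max = (values) => Math.max(...values);
const count = (values) => values.length;

// Hypothetical sample rows, for illustration only.
const sales = [
  { city: "Oslo", amount: 10 },
  { city: "Oslo", amount: 30 },
  { city: "Lima", amount: 5 },
];

const totalByCity = aggregateBy(sales, "city", "amount", sum);
const averageByCity = aggregateBy(sales, "city", "amount", avg);
const maxByCity = aggregateBy(sales, "city", "amount", max);
const countByCity = aggregateBy(sales, "city", "amount", count);

console.log({ totalByCity, averageByCity, maxByCity, countByCity });
```

Each call groups the same rows by `city` and differs only in the reducer, which is the choice a default aggregation makes for you.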
Configuring Default Aggregations
Setting Up in Schema
Use the defAggFn attribute in your schema to specify default aggregation behavior:
const schema = [
{
name: "Horsepower",
type: "measure",
defAggFn: "avg",
},
{
name: "Maker",
type: "dimension",
},
];
Note: When a default aggregation is specified, DataModel automatically applies it during any aggregation operation on that field.
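The idea of a schema-driven default can be sketched in plain JavaScript. This is an illustration of the concept only, not DataModel's implementation: `groupWithDefaults`, the `reducers` lookup, and the `"sum"` fallback for measures without a `defAggFn` are all assumptions made for this sketch.

```javascript
// Reducers keyed by the names used in the schema's defAggFn attribute.
const reducers = {
  sum: (values) => values.reduce((a, b) => a + b, 0),
  avg: (values) => values.reduce((a, b) => a + b, 0) / values.length,
};

// Hypothetical helper: group rows by one dimension and aggregate every
// measure with the function named in its schema entry.
// The "sum" fallback is an assumption of this sketch.
function groupWithDefaults(rows, schema, dimension, fallback = "sum") {
  const measures = schema.filter((field) => field.type === "measure");
  const buckets = new Map();
  for (const row of rows) {
    const key = row[dimension];
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(row);
  }
  return [...buckets].map(([key, members]) => {
    const out = { [dimension]: key };
    for (const measure of measures) {
      const reduce = reducers[measure.defAggFn || fallback];
      out[measure.name] = reduce(members.map((r) => r[measure.name]));
    }
    return out;
  });
}

const carSchema = [
  { name: "Horsepower", type: "measure", defAggFn: "avg" },
  { name: "Maker", type: "dimension" },
];

// Hypothetical rows, for illustration only.
const cars = [
  { Maker: "buick", Horsepower: 165 },
  { Maker: "buick", Horsepower: 225 },
  { Maker: "datsun", Horsepower: 88 },
];

const byMaker = groupWithDefaults(cars, carSchema, "Maker");
console.log(byMaker);
```

The caller never names an aggregation function: the schema entry for Horsepower decides that the two buick rows are averaged rather than summed.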
Available Aggregation Functions
Access these via DataModel.AggregationFunctions:
Function | Description | Use Case |
---|---|---|
SUM | Sum of all values | Total sales, total quantity |
AVG | Average of all values | Average price, mean temperature |
MAX | Maximum value | Highest score, peak value |
MIN | Minimum value | Lowest price, minimum rating |
COUNT | Count of all values | Number of transactions |
STD | Standard deviation | Data spread, variation analysis |
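For reference, the six functions in the table can be written out in a few lines of plain JavaScript. This is a sketch of what each name computes, not DataModel's own code. In particular, `STD` is written here as the population standard deviation, which is an assumption; the library may use the sample formula instead.

```javascript
const sum = (values) => values.reduce((a, b) => a + b, 0);
const avg = (values) => sum(values) / values.length;

// Keys mirror the names in the table above; implementations are illustrative.
const aggregations = {
  SUM: sum,
  AVG: avg,
  MAX: (values) => Math.max(...values),
  MIN: (values) => Math.min(...values),
  COUNT: (values) => values.length,
  // Population standard deviation (assumption: divides by n, not n - 1).
  STD: (values) => {
    const mean = avg(values);
    const variance = avg(values.map((v) => (v - mean) ** 2));
    return Math.sqrt(variance);
  },
};

// Hypothetical sample values, chosen so the results are round numbers.
const values = [2, 4, 4, 4, 5, 5, 7, 9];

// Apply every function to the same values.
const summary = Object.fromEntries(
  Object.entries(aggregations).map(([name, fn]) => [name, fn(values)])
);

console.log(summary);
```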
Practical Example
Let's create a visualization showing average horsepower by number of cylinders:
// `viz` is not defined in this snippet; it is expected to be supplied by the host environment
const { muze, getDataFromSearchQuery, env } = viz;
const DataModel = muze.DataModel;
// Sample data
const data = [
{
Name: "chevrolet chevelle malibu",
Maker: "chevrolet",
Horsepower: 130,
Cylinders: 6,
},
{
Name: "buick skylark 320",
Maker: "buick",
Horsepower: 165,
Cylinders: 8,
},
{
Name: "datsun pl510",
Maker: "datsun",
Horsepower: 88,
Cylinders: 4,
},
];
// Schema with default aggregation
const schema = [
{
name: "Cylinders",
type: "dimension",
},
{
name: "Horsepower",
type: "measure",
defAggFn: "avg", // Default aggregation set to average
},
];
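// Load the rows against the schema and build the DataModel
// (await must run inside an async function or an ES module)
const parsedData = await DataModel.loadData(data, schema);
const dm = new DataModel(parsedData);
// Create the visualization: Horsepower on rows, Cylinders on columns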
muze
.canvas()
.rows(["Horsepower"])
.columns(["Cylinders"])
.data(dm)
.mount("#chart");
Best Practices
- Choose Appropriate Aggregations
  - Consider the nature of your data
  - Think about the insights you want to convey
  - Use aggregations that make sense for your metrics
- Schema Design
  - Define aggregations for all relevant measure fields
  - Keep aggregations consistent across related metrics
  - Document your aggregation choices
- Performance
  - Use aggregations to reduce data points when dealing with large datasets
  - Consider pre-aggregating data when possible
Common Pitfalls
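- Applying a sum aggregation to data that has already been averaged, which produces totals with no meaningful interpretation
- Using count on normalized values, where the number of rows says nothing about the magnitudes being compared
- Mixing different types of aggregations (e.g., a sum next to an average) in the same visualization without labeling them, which invites invalid comparisons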
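The first pitfall can be made concrete with a small plain JavaScript sketch. The rows below are hypothetical and the code does not use DataModel. It shows that once data has been averaged per group, summing or re-averaging those averages no longer matches the raw data, unless the group sizes are kept and used as weights.

```javascript
const mean = (values) => values.reduce((a, b) => a + b, 0) / values.length;

// Hypothetical raw rows: two groups of unequal size.
const rows = [
  { Cylinders: 4, Horsepower: 80 },
  { Cylinders: 4, Horsepower: 90 },
  { Cylinders: 4, Horsepower: 100 },
  { Cylinders: 8, Horsepower: 200 },
];

// Step 1: pre-aggregate to one average per group, keeping the group size.
const groups = {};
for (const { Cylinders, Horsepower } of rows) {
  if (!groups[Cylinders]) groups[Cylinders] = [];
  groups[Cylinders].push(Horsepower);
}
const preAggregated = Object.entries(groups).map(([cylinders, hp]) => ({
  Cylinders: Number(cylinders),
  avgHorsepower: mean(hp),
  n: hp.length,
}));

// Step 2: re-aggregate the averages in three different ways.
const sumOfAverages = preAggregated.reduce((t, g) => t + g.avgHorsepower, 0);
const averageOfAverages = mean(preAggregated.map((g) => g.avgHorsepower));
const weightedAverage =
  preAggregated.reduce((t, g) => t + g.avgHorsepower * g.n, 0) /
  preAggregated.reduce((t, g) => t + g.n, 0);

// Ground truth computed from the raw rows.
const trueAverage = mean(rows.map((r) => r.Horsepower));

console.log({ sumOfAverages, averageOfAverages, weightedAverage, trueAverage });
```

Only the weighted average agrees with the average of the raw rows, which is why keeping the group size alongside any pre-aggregated average is worth considering.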